Learning functions across many orders of magnitudes

نویسندگان

  • Hado van Hasselt
  • Arthur Guez
  • Matteo Hessel
  • David Silver
چکیده

Most learning algorithms are not invariant to the scale of the function that is being approximated. We propose to adaptively normalize the targets used in learning. This is useful in value-based reinforcement learning, where the magnitude of appropriate value approximations can change over time when we update the policy of behavior. Our main motivation is prior work on learning to play Atari games, where the rewards were all clipped to a predetermined range. This clipping facilitates learning across many different games with a single learning algorithm, but a clipped reward function can result in qualitatively different behavior. Using the adaptive normalization we can remove this domain-specific heuristic without diminishing overall performance.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

The behavior of the reliability functions and stochastic orders in family of the Kumaraswamy-G distributions

The Kumaraswamy distribution is a two-parameter distribution on the interval (0,1) that is very similar to beta distribution. This distribution is applicable to many natural phenomena whose outcomes have lower and upper bounds, such as the proportion of people from society who consume certain products in a given interval.  In this paper, we introduce the family of Kumaraswamy-G distribution, an...

متن کامل

An Approach to Qualitative Radial Basis Function Networks over Orders of Magnitude

This paper lies within the domain of supervised learning algorithms based on neural networks whose architecture corresponds to radial basis functions. A methodology to use RBF when the descriptors of the patterns are given by means of their orders of magnitude is developed. A qualitative distance is constructed over the discrete structure of absolute orders of magnitude spaces. This distance is...

متن کامل

Image Classification via Sparse Representation and Subspace Alignment

Image representation is a crucial problem in image processing where there exist many low-level representations of image, i.e., SIFT, HOG and so on. But there is a missing link across low-level and high-level semantic representations. In fact, traditional machine learning approaches, e.g., non-negative matrix factorization, sparse representation and principle component analysis are employed to d...

متن کامل

Neural Episodic Control

Deep reinforcement learning methods attain super-human performance in a wide range of environments. Such methods are grossly inefficient, often taking orders of magnitudes more data than humans to achieve reasonable performance. We propose Neural Episodic Control: a deep reinforcement learning agent that is able to rapidly assimilate new experiences and act upon them. Our agent uses a semi-tabu...

متن کامل

Reinforcement Learning for Games: Failures and Successes CMA-ES and TDL in comparision

We apply CMA-ES, an evolution strategy with covariance matrix adaptation, and TDL (Temporal Difference Learning) to reinforcement learning tasks. In both cases these algorithms seek to optimize a neural network which provides the policy for playing a simple game (TicTacToe). Our contribution is to study the effect of varying learning conditions on learning speed and quality. Certain initial fai...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • CoRR

دوره abs/1602.07714  شماره 

صفحات  -

تاریخ انتشار 2016